Improvements in dbt Cloud CI for enhanced data quality and cost efficiency from Coalesce 2023
Grace Goheen, Product Manager at dbt Labs, explains the benefits of having continuous integration (CI) in dbt Cloud.
"With all of these product improvements, we believe that dbt Cloud CI is the best way to ensure code quality at scale."
Grace Goheen, Product Manager at dbt Labs, explains continuous integration (CI) in dbt Cloud and how it can improve code quality and confidence in data teams. Grace also covers recent improvements made to the platform and how it can be customized to suit specific team needs.
dbt Cloud CI integrates seamlessly with Git providers
"CI–in long-form, ‘continuous integration’–is where you set up automated runs and tests of your codebase before you merge your changes into production."
Grace explains how dbt Cloud CI integrates with Git providers, enhancing developer workflows and simplifying the management of code changes. She also elaborates that dbt Cloud CI automatically configures all default components when setting up a new job.
"dbt Cloud integrates with your Git provider, enabling you to trigger a CI check every time a developer on your team opens up a pull request." This integration ensures that all developers' changes are validated consistently.
Grace also highlights that CI checks can be triggered via API calls, an option for teams whose Git provider has no native integration with dbt Cloud. She notes that, in such cases, teams have to drop the temporary schemas themselves.
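As a rough illustration of the API route, the sketch below builds (but does not send) a request to the dbt Cloud "trigger job run" endpoint. The account ID, job ID, token, and the schema naming pattern are placeholders invented for this example, and the payload fields should be checked against the current dbt Cloud API reference before use.

```python
import json
import urllib.request


def build_ci_trigger_request(account_id, job_id, api_token, branch, pr_number,
                             host="cloud.getdbt.com"):
    """Build a POST request that asks dbt Cloud to run a job against a branch.

    Assumptions (not from the talk): the v2 endpoint path, the payload field
    names, and the 'dbt_ci_pr_<n>' schema name are illustrative only.
    """
    url = f"https://{host}/api/v2/accounts/{account_id}/jobs/{job_id}/run/"
    payload = {
        "cause": f"CI check for PR #{pr_number}",      # free-text reason for the run
        "git_branch": branch,                          # branch holding the changes
        "schema_override": f"dbt_ci_pr_{pr_number}",   # hypothetical temporary schema
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Token {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Example with placeholder values; pass the result to urllib.request.urlopen to send it.
request = build_ci_trigger_request(1234, 5678, "example-token", "feature/new-model", 42)
```

Because the run is triggered outside the native Git integration, the schema named in the payload would still need to be dropped by the team's own tooling once the pull request is closed.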
dbt Cloud's CI enhances the quality, reliability, and efficiency of data teams
dbt Cloud CI can improve the quality of data teams' work. Grace shares her own experience of accidentally introducing breaking changes to her team's production environment–something that’s far too common among data teams.
"I was only concerned with the data assets that I was working on and never considered how my changes might affect the rest of our dbt project," she admits. To combat this issue, Grace suggests implementing dbt Cloud's CI, which she describes as "the number one, best way to reduce the risk of introducing breaking changes into your production environment."
Grace adds, "dbt Cloud CI is automatic. dbt Cloud CI is safe. dbt Cloud CI is temporary. And dbt Cloud CI is slim." This means that every time a developer opens up a pull request, an automatic check is triggered that builds only the data assets relevant to the code changes, ensuring that no breaking changes are introduced into the production environment. Its temporary nature also ensures that no additional clutter is added to your data warehouse.
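To make the "slim" idea concrete, here is a toy sketch of selecting only the changed models and everything downstream of them. It is not dbt's implementation; in dbt this kind of selection is typically expressed with the state:modified+ selector. The model names and the dependency map are invented for the example.

```python
from collections import deque


def modified_plus_downstream(children, modified):
    """Return the modified nodes plus every node that depends on them.

    `children` maps each model to the models that select from it.
    """
    selected = set(modified)
    queue = deque(modified)
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in selected:
                selected.add(child)
                queue.append(child)
    return selected


# Hypothetical project: two staging models feed two marts and one report.
children = {
    "stg_orders": ["fct_orders"],
    "stg_customers": ["dim_customers"],
    "fct_orders": ["orders_report"],
    "dim_customers": ["orders_report"],
}

# A pull request that only touches stg_orders rebuilds three models, not five.
print(sorted(modified_plus_downstream(children, {"stg_orders"})))
# prints ['fct_orders', 'orders_report', 'stg_orders']
```

Untouched branches of the project (here stg_customers and dim_customers) are skipped, which is where the time and warehouse cost savings come from.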
dbt Cloud's CI is built for scalability and customization
Grace highlights the scalability and customization features of dbt Cloud's CI. She describes improvements to the setup experience and to scalability that make dbt Cloud CI usable for any team size and adaptable to specific project needs.
"We now support concurrent CI checks so an unlimited number of CI checks can run at once in dbt Cloud. So, if 10 developers on your team all open PRs at the exact same time, 10 CI checks will run in parallel in dbt Cloud," she explains.
There's also support for draft pull requests, allowing code changes to be validated before a pull request is marked ready for review. Additionally, Grace emphasizes the flexibility of CI checks. Developers can choose to process only a subset of their data, use the --warn-error flag to promote warnings to errors, clone pre-existing incremental models, or enforce alignment with dbt best practices using the dbt project evaluator package.
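Two of these customizations can be sketched as the list of commands a CI job might run. The helper below is hypothetical and only assembles command strings; the selectors and flag placement reflect a reading of the dbt documentation and should be verified against the dbt version in use.

```python
def ci_commands(warn_error=False, clone_incrementals=False):
    """Assemble an ordered list of dbt commands for a CI job (illustrative only)."""
    commands = []
    if clone_incrementals:
        # Clone already-built incremental models so the CI run exercises the
        # incremental path instead of a full first-time build.
        commands.append(
            "dbt clone --select state:modified+,config.materialized:incremental,state:old"
        )
    if warn_error:
        # Promote dbt warnings to errors so the check fails on them.
        commands.append("dbt --warn-error build --select state:modified+")
    else:
        commands.append("dbt build --select state:modified+")
    return commands


for command in ci_commands(warn_error=True, clone_incrementals=True):
    print(command)
```

In dbt Cloud, commands like these would be entered as the execution steps of the CI job rather than run from a script.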
Grace’s key insights on dbt Cloud CI
- Continuous integration (CI) is an important practice in software engineering that can also be applied to analytics engineering, through dbt Cloud CI, to reduce the risk of introducing breaking changes into production
- dbt Cloud CI is automatic, safe, temporary, and slim, running checks only on data assets relevant to code changes
- dbt Cloud CI has been improved to be more intuitive and scalable: it now runs CI checks concurrently, automatically cancels redundant CI runs, and supports draft pull requests
- There are ways to customize CI checks in dbt Cloud to suit specific team needs, including processing only a subset of data, promoting dbt warnings to errors, cloning pre-existing incremental models, and enforcing alignment with dbt best practices